145 research outputs found
Implicit Langevin Algorithms for Sampling From Log-concave Densities
For sampling from a log-concave density, we study implicit integrators
resulting from θ-method discretization of the overdamped Langevin
diffusion stochastic differential equation. Theoretical and algorithmic
properties of the resulting sampling methods for θ ∈ [0,1] and a
range of step sizes are established. Our results generalize and extend prior
works in several directions. In particular, for θ ≥ 1/2, we prove
geometric ergodicity and stability of the resulting methods for all step sizes.
We show that obtaining subsequent samples amounts to solving a strongly-convex
optimization problem, which is readily achievable using one of numerous
existing methods. Numerical examples supporting our theoretical analysis are
also presented.
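
The claim that each implicit update amounts to solving a strongly convex problem can be illustrated with a minimal sketch. It assumes a one-dimensional target with density proportional to exp(-f(x)), writes the implicitness parameter as θ (theta) and the step size as h, and solves the implicit equation with a plain Newton iteration. The names theta_step, grad_f and hess_f are hypothetical and chosen only for this illustration; the code is not taken from the paper.

```python
import numpy as np

def grad_f(x):
    # Gradient of f(x) = x^2 / 2 (standard Gaussian target, assumed for illustration).
    return x

def hess_f(x):
    # Second derivative of f(x) = x^2 / 2.
    return 1.0

def theta_step(x, grad, hess, h, theta, xi, newton_iters=50, tol=1e-12):
    """One theta-method update for dX = -grad f(X) dt + sqrt(2) dW.

    The update solves  y = v - theta * h * grad(y),  where
    v = x - (1 - theta) * h * grad(x) + sqrt(2 h) * xi.
    Equivalently, y minimises  theta * h * f(y) + (y - v)^2 / 2,
    which is strongly convex whenever f is convex.
    """
    v = x - (1.0 - theta) * h * grad(x) + np.sqrt(2.0 * h) * xi
    y = v  # start Newton's method from the explicit part
    for _ in range(newton_iters):
        g = theta * h * grad(y) + y - v        # derivative of the objective
        if abs(g) < tol:
            break
        y -= g / (theta * h * hess(y) + 1.0)   # second derivative is >= 1
    return y

def run_chain(x0, h, theta, n_steps, seed=0):
    # Simulate a chain by drawing fresh Gaussian noise at every step.
    rng = np.random.default_rng(seed)
    x = x0
    path = np.empty(n_steps)
    for k in range(n_steps):
        x = theta_step(x, grad_f, hess_f, h, theta, rng.standard_normal())
        path[k] = x
    return path
```

With theta = 0 this reduces to the explicit Euler-Maruyama (unadjusted Langevin) step; with theta = 1 it is the fully implicit step. For the Gaussian example, a large step such as h = 10 makes the explicit chain diverge while the fully implicit chain stays bounded, which is consistent with the stability result described in the abstract.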
Monte Carlo Estimation of the Density of the Sum of Dependent Random Variables
We study an unbiased estimator for the density of a sum of random variables
that are simulated from a computer model. A numerical study on examples with
copula dependence is conducted where the proposed estimator performs favourably
in terms of variance compared to other unbiased estimators. We provide
applications and extensions to the estimation of marginal densities in Bayesian
statistics and to the estimation of the density of sums of random variables
under Gaussian copula dependence.
Unbiased and Consistent Nested Sampling via Sequential Monte Carlo
We introduce a new class of sequential Monte Carlo methods called Nested
Sampling via Sequential Monte Carlo (NS-SMC), which reframes the Nested
Sampling method of Skilling (2006) in terms of sequential Monte Carlo
techniques. This new framework allows convergence results to be obtained in the
setting when Markov chain Monte Carlo (MCMC) is used to produce new samples. An
additional benefit is that marginal likelihood estimates are unbiased. In
contrast to NS, the analysis of NS-SMC does not require the (unrealistic)
assumption that the simulated samples be independent. As the original NS
algorithm is a special case of NS-SMC, this provides insights as to why NS
seems to produce accurate estimates despite a typical violation of its
assumptions. For applications of NS-SMC, we give advice on tuning MCMC kernels
in an automated manner via a preliminary pilot run, and present a new method
for appropriately choosing the number of MCMC repeats at each iteration.
Finally, a numerical study is conducted where the performance of NS-SMC and
temperature-annealed SMC is compared on several challenging and realistic
problems. MATLAB code for our experiments is made available at
https://github.com/LeahPrice/SMC-NS.
Comment: 45 pages, some minor typographical errors fixed since last version
Federated Variational Inference Methods for Structured Latent Variable Models
Federated learning methods enable model training across distributed data
sources without data leaving their original locations and have gained
increasing interest in various fields. However, existing approaches are
limited, excluding many structured probabilistic models. We present a general
and elegant solution based on structured variational inference, widely used in
Bayesian machine learning, adapted for the federated setting. Additionally, we
provide a communication-efficient variant analogous to the canonical FedAvg
algorithm. The proposed algorithms' effectiveness is demonstrated, and their
performance is compared with hierarchical Bayesian neural networks and topic
models.
Graph Neural Network-Based Anomaly Detection for River Network Systems
Water is the lifeblood of river networks, and its quality plays a crucial
role in sustaining both aquatic ecosystems and human societies. Real-time
monitoring of water quality is increasingly reliant on in-situ sensor
technology. Anomaly detection is crucial for identifying erroneous patterns in
sensor data, but can be a challenging task due to the complexity and
variability of the data, even under normal conditions. This paper presents a
solution to the challenging task of anomaly detection for river network sensor
data, which is essential for accurate and continuous monitoring. We use a graph
neural network model, the recently proposed Graph Deviation Network (GDN),
which employs graph attention-based forecasting to capture the complex
spatio-temporal relationships between sensors. We propose an alternate anomaly
scoring method, GDN+, based on the learned graph. To evaluate the model's
efficacy, we introduce new benchmarking simulation experiments with
highly sophisticated dependency structures and subsequence anomalies of various
types. We further examine the strengths and weaknesses of this baseline
approach, GDN, in comparison to other benchmarking methods on complex
real-world river network data. Findings suggest that GDN+ outperforms the
baseline approach in high-dimensional data, while also providing improved
interpretability. We also introduce software called gnnad.
A PAC-Bayesian Perspective on the Interpolating Information Criterion
Deep learning is renowned for its theory-practice gap, whereby principled
theory typically fails to provide much beneficial guidance for implementation
in practice. This has been highlighted recently by the benign overfitting
phenomenon: when neural networks become sufficiently large to interpolate the
dataset perfectly, model performance appears to improve with increasing model
size, in apparent contradiction with the well-known bias-variance tradeoff.
While such phenomena have proven challenging to theoretically study for general
models, the recently proposed Interpolating Information Criterion (IIC)
provides a valuable theoretical framework to examine performance for
overparameterized models. Using the IIC, a PAC-Bayes bound is obtained for a
general class of models, characterizing factors which influence generalization
performance in the interpolating regime. From the provided bound, we quantify
how the test error for overparameterized models achieving effectively zero
training error depends on the quality of the implicit regularization imposed by
e.g. the combination of model, optimizer, and parameter-initialization scheme;
the spectrum of the empirical neural tangent kernel; curvature of the loss
landscape; and noise present in the data.
Comment: 9 pages
Continuously-Tempered PDMP samplers
New sampling algorithms based on simulating continuous-time stochastic
processes called piecewise deterministic Markov processes (PDMPs) have shown
considerable promise. However, these methods can struggle to sample from
multi-modal or heavy-tailed distributions. We show how tempering ideas can
improve the mixing of PDMPs in such cases. We introduce an extended
distribution defined over the state of the posterior distribution and an
inverse temperature, which interpolates between a tractable distribution when
the inverse temperature is 0 and the posterior when the inverse temperature is
1. The marginal distribution of the inverse temperature is a mixture of a
continuous distribution on [0,1) and a point mass at 1. This means that we
obtain samples when the inverse temperature is 1, and these are draws from the
posterior, but the sampling algorithm will also explore distributions at lower
inverse temperatures, which improves mixing. We show how PDMPs, and
particularly the Zig-Zag sampler, can be implemented to sample from such an
extended distribution. The resulting algorithm is easy to implement and we
show empirically that it can outperform existing PDMP-based samplers on
challenging multimodal posteriors.
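
One way to write down an extended distribution of the kind described above is sketched here. The geometric interpolation, the symbols π_0 (tractable distribution), π (posterior), κ (continuous density on [0,1)) and α (weight of the point mass) are assumptions made for illustration; the abstract does not state the exact construction used in the paper.

```latex
% Assumed geometric path between a tractable density \pi_0 and the posterior \pi.
\[
  q_\beta(x) \;=\; \frac{\pi_0(x)^{1-\beta}\,\pi(x)^{\beta}}{Z(\beta)},
  \qquad
  Z(\beta) \;=\; \int \pi_0(x)^{1-\beta}\,\pi(x)^{\beta}\,\mathrm{d}x,
  \qquad \beta \in [0,1],
\]
% so that q_0 = \pi_0 and q_1 = \pi.

% Assumed marginal of the inverse temperature: a continuous part on [0,1)
% with density \kappa, plus a point mass of weight \alpha \in (0,1) at \beta = 1.
\[
  \nu(\mathrm{d}\beta) \;=\; (1-\alpha)\,\kappa(\beta)\,\mathrm{d}\beta
  \;+\; \alpha\,\delta_{1}(\mathrm{d}\beta).
\]

% Extended (joint) distribution over state and inverse temperature.
\[
  \omega(\mathrm{d}x,\mathrm{d}\beta) \;=\; q_\beta(x)\,\mathrm{d}x\;\nu(\mathrm{d}\beta).
\]
```

Under this sketch, the draws of x recorded while β = 1 are distributed according to π, and the point mass α controls the fraction of time the sampler spends at the posterior.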